Integrating information theory and adversarial learning for cross-modal retrieval

Authors

Abstract

Accurately matching visual and textual data in cross-modal retrieval has been widely studied in the multimedia community. To address the challenges posed by the heterogeneity gap and the semantic gap, we propose integrating Shannon information theory and adversarial learning. In terms of the heterogeneity gap, we integrate modality classification and information entropy maximization adversarially. For this purpose, a modality classifier (as a discriminator) is built to distinguish the text and image modalities according to their different statistical properties. This discriminator uses its output probabilities to compute Shannon information entropy, which measures the uncertainty of the modality classification it performs. Moreover, the feature encoders (as a generator) project uni-modal features into a commonly shared space and attempt to fool the discriminator by maximizing its output entropy. Thus, maximizing entropy gradually reduces the distribution discrepancy of cross-modal features, thereby achieving a domain confusion state where the discriminator cannot classify the two modalities confidently. To reduce the semantic gap, Kullback-Leibler (KL) divergence and a bi-directional triplet loss are used to associate intra- and inter-modality similarity between features in the shared space. Furthermore, a regularization term based on KL-divergence with temperature scaling is used to calibrate the biased label classifier caused by the data imbalance issue. Extensive experiments with four deep models on four benchmarks are conducted to demonstrate the effectiveness of the proposed approach.
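The abstract describes two adversarial objectives (a modality discriminator trained with cross-entropy, and feature encoders trained to maximize the discriminator's output entropy) plus a temperature-scaled KL regularizer. Below is a minimal PyTorch-style sketch of those losses; the module names, network sizes, and the choice of teacher distribution for the calibration term are illustrative assumptions, not the authors' implementation.

```python
# Sketch of the entropy-maximization adversarial idea from the abstract.
# Names and architectures are illustrative, not the paper's exact code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityDiscriminator(nn.Module):
    """Classifies whether a shared-space feature came from the image or text modality."""

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(inplace=True), nn.Linear(hidden, 2)
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features)  # logits over {image, text}


def discriminator_loss(disc, img_feat, txt_feat):
    """Cross-entropy: the discriminator learns to tell the two modalities apart."""
    logits = disc(torch.cat([img_feat.detach(), txt_feat.detach()], dim=0))
    labels = torch.cat([
        torch.zeros(img_feat.size(0), dtype=torch.long, device=img_feat.device),  # image = 0
        torch.ones(txt_feat.size(0), dtype=torch.long, device=txt_feat.device),   # text = 1
    ])
    return F.cross_entropy(logits, labels)


def entropy_maximization_loss(disc, img_feat, txt_feat):
    """The encoders (generator) try to maximize the discriminator's output entropy,
    pushing its predictions toward uniform, so we minimize the negative entropy."""
    probs = F.softmax(disc(torch.cat([img_feat, txt_feat], dim=0)), dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    return -entropy  # minimizing this maximizes entropy (domain confusion)


def calibrated_kl_regularizer(student_logits, teacher_logits, T: float = 4.0):
    """KL-divergence with temperature scaling, as a calibration term for biased
    label predictions; the teacher/target choice here is an assumption."""
    log_p = F.log_softmax(student_logits / T, dim=1)
    q = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p, q, reduction="batchmean") * (T * T)
```

In a typical adversarial training loop, discriminator_loss would update only the discriminator, while entropy_maximization_loss (together with the triplet and KL alignment terms mentioned in the abstract) would update the encoders, alternating until the discriminator's predictions are close to uniform.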


Similar articles

MHTN: Modal-adversarial Hybrid Transfer Network for Cross-modal Retrieval

Cross-modal retrieval has drawn wide interest for retrieval across different modalities of data (such as text, image, video, audio and 3D model). However, existing methods based on deep neural network (DNN) often face the challenge of insufficient cross-modal training data, which limits the training effectiveness and easily leads to overfitting. Transfer learning is usually adopted for relievin...


Cross-Modal Manifold Learning for Cross-modal Retrieval

This paper presents a new scalable algorithm for cross-modal similarity preserving retrieval in a learnt manifold space. Unlike existing approaches that compromise between preserving global and local geometries, the proposed technique respects both simultaneously during manifold alignment. The global topologies are maintained by recovering underlying mapping functions in the joint manifold spac...


HashGAN: Attention-aware Deep Adversarial Hashing for Cross Modal Retrieval

With the rapid growth of multi-modal data, hashing methods for cross-modal retrieval have received considerable attention. Deep-networks-based cross-modal hashing methods are appealing as they can integrate feature learning and hash coding into end-to-end trainable frameworks. However, it is still challenging to find content similarities between different modalities of data due to the heterogenei...


Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval

Thanks to the success of deep learning, cross-modal retrieval has made significant progress recently. However, there still remains a crucial bottleneck: how to bridge the modality gap to further enhance the retrieval accuracy. In this paper, we propose a self-supervised adversarial hashing (SSAH) approach, which lies among the early attempts to incorporate adversarial learning into cross-modal ...


Integrating Textual and Visual Information for Cross-Language Image Retrieval

This paper explores the integration of textual and visual information for cross-language image retrieval. An approach which automatically transforms textual queries into visual representations is proposed. The relationships between text and images are mined. We employ the mined relationships to construct visual queries from textual ones. The retrieval results of textual and visual queries are c...



Journal

Journal title: Pattern Recognition

Year: 2021

ISSN: 1873-5142, 0031-3203

DOI: https://doi.org/10.1016/j.patcog.2021.107983